visual communication
Learning to Draw: Emergent Communication through Sketching
Evidence that visual communication preceded written language and provided a basis for it goes back to prehistory, in forms such as cave and rock paintings depicting traces of our distant ancestors. Emergent communication research has sought to explore how agents can learn to communicate in order to collaboratively solve tasks. Existing research has focused on language, with a learned communication channel transmitting sequences of discrete tokens between the agents. In this work, we explore a visual communication channel between agents that are allowed to draw with simple strokes. Our agents are parameterised by deep neural networks, and the drawing procedure is differentiable, allowing for end-to-end training. In the framework of a referential communication game, we demonstrate that agents can not only successfully learn to communicate by drawing, but with appropriate inductive biases, can do so in a fashion that humans can interpret. We hope to encourage future research to consider visual communication as a more flexible and directly interpretable alternative for training collaborative agents.
Visuals of AI in the military domain: beyond 'killer robots' and towards better images?
In this blog post, Anna Nadibaidze explores the main themes found across common visuals of AI in the military domain. Inspired by the work and mission of Better Images of AI, she argues for the need to discuss and find alternatives to images of humanoid 'killer robots'. Anna holds a PhD in Political Science from the University of Southern Denmark (SDU) and is a researcher for the AutoNorms project, based at SDU. The integration of artificial intelligence (AI) technologies into the military domain, especially weapon systems and the process of using force, has been the topic of international academic, policy, and regulatory debates for more than a decade. The visual aspect of these discussions, however, has not been analysed in depth. This is both puzzling, considering the role that images play in shaping parts of the discourses on AI in warfare, and potentially problematic, given that many of these visuals, as I explore below, misrepresent major issues at stake in the debate.
A Tiny Machine Learning Model for Point Cloud Object Classification
Zhang, Min, Xue, Jintang, Kadam, Pranav, Prajapati, Hardik, Liu, Shan, Kuo, C. -C. Jay
The design of a tiny machine learning model, which can be deployed in mobile and edge devices, for point cloud object classification is investigated in this work. To achieve this objective, we replace the multi-scale representation of a point cloud object with a single-scale representation for complexity reduction, and exploit rich 3D geometric information of a point cloud object for performance improvement. The proposed solution is named Green-PointHop due to its low computational complexity. We evaluate the performance of Green-PointHop on two datasets, ModelNet40 and ScanObjectNN. Green-PointHop has a model size of 64K parameters. It demands 2.3M floating-point operations (FLOPs) to classify a ModelNet40 object of 1024 down-sampled points. Its classification performance gaps against the state-of-the-art DGCNN method are 3% and 7% for ModelNet40 and ScanObjectNN, respectively. On the other hand, the model size and inference complexity of DGCNN are 42 and 1203 times those of Green-PointHop, respectively.
Iconary: A pictionary-like game to improve the communication skills of AI agents
While artificial intelligence (AI) agents have become increasingly skilled at communicating with humans, they still struggle with several aspects of language, including complex semantics. The term semantics refers to the area of linguistics that relates to the meaning associated with specific words or logical connections between different concepts. A few years ago, researchers at Allen Institute for AI developed a game called Iconary, which is designed to improve the ability of AI techniques to communicate and make connections between different objects. In a recent paper pre-published on arXiv and presented at last year's EMNLP conference, the researchers introduced a more advanced version of the game and trained machine learning algorithms to play against each other or with humans. "Our paper is based on a project at AI2 aimed at training models to play Iconary, a Pictionary-based game we created, where a player has to guess what another player is drawing," Christopher Clark, one of the researchers who carried out the study, told TechXplore.
Dogs' brains are not hardwired to respond to human faces, study reveals
Researchers found that our furry friends' brains are not hardwired to focus on human faces, but respond with more excitement when an animal of the same species is in view. Using an MRI machine, the team monitored brain activity in both humans and dogs as they watched two-second videos that displayed dog and human faces and the backs of heads. The results from the animals showed that no part of their brains responded more to faces, but researchers note that the reason dogs pay attention to human faces is that they evolved to depend on their owners. The study was conducted by a team of Hungary- and Mexico-based researchers, who worked together to compare how dog and human brains process visual information.
Pragmatic inference and visual abstraction enable contextual flexibility during visual communication
Fan, Judith, Hawkins, Robert, Wu, Mike, Goodman, Noah
Visual modes of communication are ubiquitous in modern life. Here we investigate drawing, the most basic form of visual communication. Communicative drawing poses a core challenge for theories of how vision and social cognition interact, requiring a detailed understanding of how sensory information and social context jointly determine what information is relevant to communicate. Participants (N=192) were paired in an online environment to play a sketching-based reference game. On each trial, both participants were shown the same four objects, but in different locations. The sketcher's goal was to draw one of these objects - the target - so that the viewer could select it from the array. There were two types of trials: close, where objects belonged to the same basic-level category, and far, where objects belonged to different categories. We found that people exploited information in common ground with their partner to efficiently communicate about the target: on far trials, sketchers achieved high recognition accuracy while applying fewer strokes, using less ink, and spending less time on their drawings than on close trials. We hypothesized that humans succeed in this task by recruiting two core competencies: (1) visual abstraction, the capacity to perceive the correspondence between an object and a drawing of it; and (2) pragmatic inference, the ability to infer what information would help a viewer distinguish the target from distractors. To evaluate this hypothesis, we developed a computational model of the sketcher that embodied both competencies, instantiated as a deep convolutional neural network nested within a probabilistic program. We found that this model fit human data well and outperformed lesioned variants, providing an algorithmically explicit theory of how perception and social cognition jointly support contextual flexibility in visual communication.
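The trade-off described in the abstract above, between how well a sketch picks out the target and how costly it is to draw, can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the authors' model, which nests a deep convolutional network inside a probabilistic program. The function names, the similarity scores, the drawing costs, and the cost_weight parameter are all hypothetical, chosen only for illustration.

```python
import math

def viewer_probs(similarity_row):
    """Toy viewer: softmax over objects, given one sketch's similarity to each."""
    top = max(similarity_row)
    exps = [math.exp(s - top) for s in similarity_row]
    total = sum(exps)
    return [e / total for e in exps]

def choose_sketch(similarity, costs, target, cost_weight=1.0):
    """Toy pragmatic sketcher: pick the sketch that maximises
    log P_viewer(target | sketch) - cost_weight * cost(sketch)."""
    best_index, best_utility = None, -math.inf
    for index, row in enumerate(similarity):
        informativeness = math.log(viewer_probs(row)[target])
        utility = informativeness - cost_weight * costs[index]
        if utility > best_utility:
            best_index, best_utility = index, utility
    return best_index

# Hypothetical numbers. Rows are candidate sketches (0 = sparse, 1 = detailed);
# columns are the four objects in the array, with the target in column 0.
costs = [0.2, 1.0]                      # assumed drawing effort per sketch
far_context = [[3.0, 0.0, 0.0, 0.0],    # distractors from other categories
               [4.0, 0.0, 0.0, 0.0]]
close_context = [[3.0, 2.8, 2.8, 2.8],  # distractors from the same category
                 [4.0, 1.0, 1.0, 1.0]]

far_choice = choose_sketch(far_context, costs, target=0)
close_choice = choose_sketch(close_context, costs, target=0)
print("far trial:", far_choice, "close trial:", close_choice)
```

With these assumed numbers, the sparse sketch is enough when the distractors are dissimilar, while the detailed sketch pays for its extra cost only when the distractors resemble the target, which mirrors the qualitative pattern reported for far and close trials.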
Speech recognition triggers fun AR stickers in Panda's video app
Panda has built the next silly social feature Snapchat and Instagram will want to steal. Today the startup launches its video messaging app that fills the screen with augmented reality effects based on the words you speak. Say "Want to get pizza?" and a 3D pizza slice hovers by your mouth. Say "I wear my sunglasses at night" and suddenly you're wearing AR shades with a moon hung above your head. Instead of distracting you with a menu of effects to pick from, the app makes them appear in real time as you chat.
The Role of Emotional Intelligence in AI
For that reason, marketers are now turning to messaging platforms to improve communication channels for sales and customer service conversations. Companies like United Airlines, Pizza Hut, Denny's Diner, Focus Features, and Patrón, just to name a few, have implemented bots on social media to field customer service issues or help consumers seek information more quickly.